Abstract

It is known that both quantum and classical cellular automata (CA) exist that 
are computationally universal in the sense that they can simulate, after appropriate 
initialization, any quantum or classical computation, respectively. Here we introduce 
a different notion of universality: a CA is called physically universal if every transfor- 
mation on any finite region can be (approximately) implemented by the autonomous 
time evolution of the system after the complement of the region has been initialized in 
an appropriate way. We pose the question of whether physically universal CAs exist. 

Such CAs would provide a model of the world where the boundary between a phys- 
ical system and its controller can be consistently shifted, in analogy to the Heisenberg 
cut for the quantum measurement problem. We propose to study the thermodynamic 
cost of computation and control within such a model because implementing a cyclic 
process on a microsystem may require a non-cyclic process for its controller, whereas 
implementing a cyclic process on system and controller may require the implementation
of a non-cyclic process on a "meta"-controller, and so on. Physically universal
CAs avoid this infinite hierarchy of controllers and the cost of implementing cycles on 
a subsystem can be described by mixing properties of the CA dynamics. 

We define a physical prior on the CA configurations by applying the dynamics to 
an initial state where half of the CA is in the maximum entropy state and half of it 
is in the all-zero state (thus reflecting the fact that life requires non-equilibrium states 
like the boundary between a hot and a cold reservoir). As opposed to Solomonoff's
prior, our prior does not only account for the Kolmogorov complexity but also for the 
cost of isolating the system during the state preparation if the preparation process is 
not robust. 

The main goal of this article is to formally state several open problems and sketch 
their relevance for the foundations of physics rather than providing results. 
1 Towards a physical theory of control



In the abstract framework of both quantum theory and classical physics, the following 
concepts play a crucial role: (1) states (2) dynamical evolution (3) measurements (4) 
system composition and (5) restriction of the state of a composed system to one of its 
components. In quantum theory, states are given by density operators (i.e., positive
operators with trace one) on the system Hilbert space $\mathcal{H}$, the dynamical evolution
is described by a semigroup of completely positive trace-preserving maps, measurements
are described by positive-operator-valued measures (POVMs), and system composition is
described by tensor products of Hilbert spaces [1][2][3]. Finally, partial traces define
system restriction. 

In classical physics, the states are probability distributions on a phase space, the dy- 
namics is given by a semi-group of stochastic maps, system composition is given by the 
cartesian product of the phase spaces, and state restriction is given by marginalization 
of probability measures. 

Having such a framework for the physical world raises the question to what extent 
the formalism also contains states, dynamical evolutions, and measurements that do 
not correspond to any physically possible situation or process. Restricting the attention 
to quantum theory, these questions thus read: (1) Is every density operator on $\mathcal{H}$ a
physically possible state, (2) is every completely positive trace-preserving operation a 
process that can be implemented in nature, (3) is there a measurement procedure for 
every POVM? 

First we describe in what sense modern quantum computing (QC) research ^ has 
given an affirmative answer to all these questions and in what sense it has not. To this 
end, we first rephrase some terminology of QC. A quantum bit (qubit) is a quantum
system with Hilbert space $\mathbb{C}^2$; a quantum register is a collection of $n$ qubits.^ Researchers
have described various physical systems having a quantum degree of freedom
for which two states are universally controllable in the following sense: any unitary
operation on $\mathbb{C}^2$ ("single-qubit gate") can be performed by appropriate operations on
the system. Moreover, they have described how to implement controlled interactions
between pairs of qubits, thus implementing a unitary on the Hilbert space $\mathcal{H} := \mathbb{C}^2 \otimes \mathbb{C}^2$
that is not a product of single-qubit operations (hence a proper "two-qubit gate"). It
was then shown that sequences of one- and two-qubit gates are sufficient for implement- 
ing arbitrary unitary operations up to any desired precision [1]. Being able to prepare 
one pure state of the quantum register thus enables the preparation of any pure state. 
Moreover, measurements with respect to any measurement basis can be reduced to 
measurements with respect to a single reference basis by first transforming the state 
to the latter basis via a unitary transformation. Preparations of mixed states, imple- 
mentation of general completely positive trace-preserving maps and measurements for 
general POVMs can be obtained by restriction of states to a subsystem. In this sense, 
questions (1)-(3) seem to be answered with "yes". The operational meaning of
some multi-qubit states, dynamical evolutions, and observables is then limited only by the fact
that the implementation time could exceed the lifetime of the universe.^ There
are, however, two other reasons why QC did not answer our questions in the sense
intended here.

^It should be noted that the restriction to two-dimensional systems is only a matter of convention.
^For a complexity theory of states and observables see e.g. jHl [Zl [5]

First, the quantum mechanical degrees of freedom defining the qubits in existing
proposals for QC [3] are only a small part of the entire degrees of freedom of physical
particles (e.g., the nuclear spin of a particle, or two levels of an internal degree
of freedom of a trapped ion). So far, it has not been claimed that all the degrees of
freedom of such a particle would be controllable simultaneously. 

The second reason why we are not satisfied with the answer given by QC is that we 
would like to see a theoretical model of quantum control that treats the controller as 
the same type of physical system as the system to be controlled. Within such a unifying 
model - as proposed by the present article - we are able to explore the conditions under 
which one system acts as controller of the other, even though a physical interaction 
can send information in both directions.^ Moreover, asking how to control
the controller shifts the problem of controlling the system to the problem of
controlling the controller by a "meta-controller", leading to an infinite hierarchy of
controllers.

Remarkably, the same shift between system and its interface is generally accepted 
for the quantum measurement problem: Once the quantum measurement process is 
described by an interaction between system and the measurement apparatus, the ques- 
tion occurs "who measures the measurement apparatus?", which leads to the same 
chain of measurement instruments as we have stated for the controller problem above. 
For the measurement apparatus, it has been argued that the cut between system and
measurement instrument is arbitrary: the description must remain consistent if the
boundary is shifted. Likewise, we argue that quantum control has a consistent description
if one can show that the cut between system and controller can be shifted. In
[9], we have already described a toy model of quantum control with a fixed interaction
between controller and system, where operations on the system are implemented by 
implementing transformations on the controller. In the present paper, we assume that 
we are only able to implement state preparations on the controller. We first state on 
an abstract level what we would consider a consistent model of physical control, before 
it will be made precise within the setting of cellular automata (CA): 

Definition 1 (model of physical control, abstract version)

Let $(\alpha_t)$ with $t \in \mathbb{R}$ or $t \in \mathbb{Z}$ be a group describing the dynamical evolution on the state
space of the world $W$. Then every mathematically possible operation on the physical
state space of some region $R$ can be implemented by initializing the complement $W \setminus R$ of
the region to an appropriate state and waiting until $\alpha_t$ implements the desired operation.

To motivate Definition 1, we first consider an arbitrary experimental setup that
is able to implement one particular control operation. The control operation may,
for instance, be to change the quantum state of a few ions in an ion trap in some
desired way. To this end, some sophisticated sequence of laser pulses is applied to the
system. Assume that the pulses are controlled by a computer program so that there
is no need for the experimentalist to intervene once the program runs. We can then
consider computer, laser, and the ions in the trap as one big physical system on which the
global dynamics of the world acts. Obviously, the computer software controlling this
process is just a physical state of the computer. However, here we want to go further
and also consider the presence or absence of the hardware of the experimental setup
merely as different states of a larger system (which implicitly refers to a field-theoretic
point of view). From such a perspective, there is no distinction between hardware and
software in the experimental setup, and the whole control operation on the system to
be controlled (the ions) is implemented by changing the physical state of the system's
environment.

^Note that unidirectionality of causal influence occurs not only if the controller is significantly larger than
the system to be controlled; it is also a matter of the state of the controller. For such toy models
of quantum control see e.g., [Sllin]; Refs. [11][12] discuss thermodynamic aspects of unidirectionality.

Following, for instance, [13][14][15], we will consider cellular automata (CAs) as
interesting models of the world and therefore study our problem in the context of CAs.
Our main focus (Section 2) will be on classical CAs since the problem seems to be
non-trivial even in the classical regime. Apart from describing possible definitions of
physical universality (Subsection 2.1), we discuss some relations between physical
universality and ergodic properties of CAs in Subsection 2.2. Subsections 2.3 and 2.4 argue
why physically universal CAs are helpful for studying limits of control and thermodynamic
laws from a new perspective. Subsection 2.5 proposes a prior distribution for
physical states based on physically universal CAs. To this end, we consider an initial
state of the CA where half of the cells are set to zero and the other half are in the
maximum entropy state (thus modelling a hot and a cold part of the universe). Section 3
briefly discusses physical universality for quantum CAs (Subsection [s]) and physically
universal Hamiltonians as their continuous analog (Subsection 3.3), where controllabil- 
ity also implies the ability to control the preparation of quantum superpositions by a 
classical program. In the context of physically universal Hamiltonians, the terms "hot" 
and "cold" part of the universe can be taken more literally because they really refer to 
Gibbs states. This makes the physical interpretation of the prior more obvious. 

The main contribution of this article is to raise the question of how to define the 
right framework for a physical control theory that also treats the controller as an object 
internal to the theory. In posing this question, the paper sketches the possible impact 
of such a framework, but it will not present any deep results on cellular automata. 



2 Physically universal classical cellular automata 
2.1 Possible options for defining physical universality 

We first introduce some terminology and notation for classical cellular automata (CAs).
Let $L := \mathbb{Z}^d$ for some $d \in \mathbb{N}$ be a $d$-dimensional lattice and $A$ be an alphabet of states
of a single cell (without loss of generality, let one of the symbols be "0"). The space of
pure states of the CA is given by

$$S := A^L .$$

The space of mixed states is given by probability distributions on $S$. The maximally
mixed state (maximum entropy state) is given by the uniform distribution over $S$, i.e.,
the infinite product of uniform distributions over $A$. For every state $s \in S$ and every
subset $R \subseteq L$, the restriction of $s$ to $R$, denoted by $s|_R$, is defined by the substring
$s' \in A^R$ corresponding to $R$. A region will be a subset $R \subseteq L$. Usually, our regions will
be finite subsets unless we state the opposite.
By slightly abusing notation, we set

$$s|_x := s|_{\{x\}}$$

for any $x \in L$. A configuration of a region $R$ is a string $c \in A^R$. It defines, in a
straightforward way, the cylinder set

$$\{ s \in A^L \mid s|_R = c \} ,$$

which will also be denoted by $c$ whenever this causes no confusion. The entropy of $R$
in the mixed state $\nu$ is given by the Shannon entropy of the restriction of $\nu$ to $R$, i.e.,

$$S(\nu|_R) := - \sum_{c \in A^R} \nu(c) \log \nu(c) .$$

The time evolution $(\alpha_t)_{t \in \mathbb{Z}}$ of a CA is a group (by assuming the group property we
implicitly restrict the attention to reversible CAs) of translation covariant maps

$$\alpha_t : S \to S ,$$

that is local in the sense that $\alpha_{\pm 1}(s)|_x$ only depends on the state of the cells lying in
some neighborhood of $x$. Here we consider the Moore neighborhood of radius one, i.e.,
all cells $y$ with $\|y - x\|_\infty \le 1$ [16].

By slightly overloading notation, we also write $\alpha_t(c)|_R$ if $c \in A^{R'}$ is the configuration
of any region $R'$ that contains all cells relevant for determining the state of $R$ at time
$t$ (which is, for instance, the case if $R'$ contains the Moore neighborhood of $R$ with
radius $t$). If a configuration $c \in A^R$ is defined via $c := (c_1, c_2)$ with $c_1 \in A^{R_1}$ and
$c_2 \in A^{R_2}$ for $R = R_1 \cup R_2$, we write $\alpha_t(c_1, c_2)|_R$ instead of $\alpha_t((c_1, c_2))|_R$.
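
As an illustration of this notation, the following Python sketch computes $\alpha_t(c)|_R$ for a one-dimensional CA from a configuration on the radius-$t$ neighborhood of an interval $R$. The function names `evolve_window` and `shift_rule`, and the representation of the local rule as a function of (left, center, right), are assumptions made for illustration only; the shift is used as local rule because it is reversible.

```python
def evolve_window(local_rule, window, t):
    """Return alpha_t(c)|_R for a 1D CA with a radius-one neighborhood.

    `window` holds the symbols of the cells a-t, ..., b+t (the radius-t
    neighborhood R' of the interval R = [a, b]); the result holds the
    symbols of the cells a, ..., b after t time steps.
    """
    if len(window) < 2 * t + 1:
        raise ValueError("window must contain the radius-t neighborhood of R")
    window = tuple(window)
    for _ in range(t):
        # Each step determines only those cells whose full neighborhood is
        # known, so the window shrinks by one cell on each side.
        window = tuple(
            local_rule(window[i - 1], window[i], window[i + 1])
            for i in range(1, len(window) - 1)
        )
    return window


def shift_rule(left, center, right):
    """Local rule of the shift by x = 1: each cell copies its left neighbor."""
    return left


# R is the single middle cell; after t = 2 steps it holds the symbol that
# started two cells to its left.
result = evolve_window(shift_rule, (1, 0, 1, 1, 0), 2)
```

The shrinking window mirrors the statement that $R'$ must contain the Moore neighborhood of $R$ with radius $t$.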

The following definition formalizes the weakest form among all notions of physical 
universality that we define. It is the ability to change the state of a region R by 
initializing the complement of R in an appropriate way: 

Definition 2 (conditional state preparation) 

A CA is said to allow for conditional state preparation if for every region $R \subseteq L$ and
every pair $(c_i, c_f)$ of initial and final configurations of $R$ there exists a configuration
$e \in A^{L \setminus R}$ and a time $t \in \mathbb{N}_0$ such that

$$\alpha_t(e, c_i)|_R = c_f .$$

Less formally speaking, the dynamics prepares the final state $c_f \in A^R$ after the time $t$,
given that the environment started in the state $e$ and the region in the state $c_i$.

Note that the state $e$ can be chosen differently for every initial state $c_i$. The following
notion of state preparation is stronger since it demands the existence of a state $e$ that
works for every initial configuration $c_i$:
Definition 3 (unconditional state preparation)

A CA is said to allow for unconditional state preparation if for every finite subset
$R \subset L$ of cells and every configuration $c_f \in A^R$, there exists a configuration $e \in A^{L \setminus R}$
and a time $t \in \mathbb{N}$ such that

$$\alpha_t(e, c_i)|_R = c_f$$

for every configuration $c_i \in A^R$.

Less formally speaking, the dynamics prepares the state $c_f$ in the region $R$ by
initializing the complement of $R$ to $e$, regardless of the initial state $c_i$ of $R$.
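
The quantifier structure of Definition 3 (one $e$ and one $t$ that work for every $c_i$) can be made concrete by an exhaustive search. The following Python sketch is only a finite stand-in for the definition: it assumes a ring of $n$ cells with periodic boundary instead of the infinite lattice $L$, and all function names are hypothetical. As test dynamics it uses the shift by one cell and the identity map.

```python
from itertools import product


def shift_step(state):
    """One step of the shift on a ring: cell y takes the symbol of cell y - 1."""
    return state[-1:] + state[:-1]


def restrict_after(step, state, t, region):
    """Apply `step` t times and return the restriction to `region`."""
    for _ in range(t):
        state = step(state)
    return tuple(state[y] for y in region)


def assemble(n, region, c, env_cells, e):
    """Build the global state (e, c) on a ring of n cells."""
    state = [None] * n
    for y, symbol in zip(region, c):
        state[y] = symbol
    for y, symbol in zip(env_cells, e):
        state[y] = symbol
    return tuple(state)


def unconditional_preparation(step, n, region, c_f, alphabet=(0, 1), max_t=None):
    """Search for (e, t) with alpha_t(e, c_i)|_R = c_f for every c_i."""
    env_cells = [y for y in range(n) if y not in region]
    max_t = n if max_t is None else max_t
    for t in range(1, max_t + 1):
        for e in product(alphabet, repeat=len(env_cells)):
            if all(
                restrict_after(step, assemble(n, region, c_i, env_cells, e), t, region)
                == tuple(c_f)
                for c_i in product(alphabet, repeat=len(region))
            ):
                return e, t
    return None


# Region R = {0, 1} on a ring of 6 cells, target configuration c_f = (1, 0).
found = unconditional_preparation(shift_step, 6, [0, 1], (1, 0))
# The identity dynamics never changes R, so no (e, t) can exist.
not_found = unconditional_preparation(lambda s: s, 4, [0], (1,), max_t=3)
```

Dropping the inner `all(...)` over $c_i$ and fixing one $c_i$ instead would turn the same search into a check of Definition 2.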

It seems that Definition 2 already formalizes a sufficiently strong property because one
could prepare the environment after having read out the initial state $c_i$ of the region
$R$. However, the entire process of readout and conditioning the initialization of the
complement of $R$ on the state $c_i$ should also be implemented by the physical laws that
govern the dynamics of the world. Therefore, we consider Definition 3 as the
better notion of universal state preparation. Nevertheless, the following example shows
that Definition 3 is a rather weak notion of universality since it is already satisfied by
a simple shift:

Example 1 (shift) 

For $x \in L$ let $\alpha_1$ be given by shifting the state by the vector $x$, i.e.,

$$\alpha_1(s)|_y := s|_{y - x} \quad \forall s \in A^L, \; y \in L .$$

For some finite region $R \subset L$, let $c_f \in A^R$ be an arbitrary configuration. Then $c_f$ can
be prepared as follows. Choose some $t_0$ such that

$$(R + x t_0) \cap R = \emptyset .$$

Initialize the region $R' := R - x t_0$ to the translated copy of $c_f$. Then the region $R$ is
obviously in the configuration $c_f$ at time $t_0$.

If $d = 1$ and $x = 1$, the dynamics shifts the state of each site by one. Then the
corresponding measure-preserving dynamical system (MDS, see Definition 5) is known as Bernoulli shift.
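
The construction of Example 1 can be written out for $d = 1$ as the following Python sketch. It assumes that a partial state is stored as a dictionary from integer cells to symbols (cells that do not matter are simply absent); the function names are hypothetical and not taken from the text.

```python
def smallest_disjoint_time(region, x):
    """Smallest t0 >= 1 with (R + x*t0) and R disjoint, for a finite set R."""
    if x == 0:
        raise ValueError("the shift vector x must be non-zero")
    cells = set(region)
    t0 = 1
    while any((y + x * t0) in cells for y in cells):
        t0 += 1
    return t0


def shift_preparation(c_f, x):
    """Return (t0, e), where e is the translated copy of c_f on R' = R - x*t0."""
    t0 = smallest_disjoint_time(c_f.keys(), x)
    e = {y - x * t0: symbol for y, symbol in c_f.items()}
    return t0, e


def evolve_shift(state, x, t):
    """alpha_t for the shift: the symbol at cell y moves to cell y + x*t."""
    return {y + x * t: symbol for y, symbol in state.items()}


c_f = {0: 1, 1: 0, 2: 1}                  # target configuration on R = {0, 1, 2}
t0, e = shift_preparation(c_f, 1)
initial = {**e, 0: 0, 1: 1, 2: 1}         # an arbitrary initial configuration c_i on R
final = evolve_shift(initial, 1, t0)
prepared = {y: final[y] for y in c_f}     # restriction of the final state to R
```

Since `e` does not depend on the initial content of $R$, the same initialization works for every $c_i$, which is why the shift satisfies Definition 3.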

However, such a trivial model of dynamical evolution is unacceptable as a model
for universal control. One reason is that it lacks computational power. We could ask for
models that are computationally universal and allow for universal state preparation in
the sense of Definition 2 or 3. Rather than postulating computational power a priori,
we prefer demanding that the model allows for non-trivial operations other than state
preparation. The following condition includes conditional state preparation and is
obviously not satisfied for the shift dynamics:

Definition 4 (universal implementation of bijections) 

A CA is said to allow for universal implementation of bijections if for every finite
region $R \subset L$ and every bijective map

$$\pi : A^R \to A^R$$

there is a configuration $e \in A^{L \setminus R}$ of the complement of $R$ and a time $t$ such that

$$\alpha_t(e, c)|_R = \pi(c) \quad \forall c \in A^R .$$
Note that the ability to implement bijections implies the ability to implement
measurements in the following sense: apart from a region $R$ whose state should be
measured, define a region $R_m$ which serves as a measurement apparatus. One can then
implement a bijection $\pi$ on $R \cup R_m$ that changes the state of $R_m$ depending on the
state of $R$.
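
One possible choice of such a measurement bijection is sketched below in Python. It assumes, purely for illustration, that $R_m$ has as many cells as $R$ and that the alphabet is $\{0, \dots, q-1\}$; the map adds the state of $R$ cellwise (modulo $q$) onto $R_m$. The names `measure` and `is_bijection` are hypothetical.

```python
from itertools import product


def measure(r, m, q=2):
    """Bijection pi on A^R x A^{R_m}: R is unchanged, R_m records the state of R."""
    return tuple(r), tuple((mi + ri) % q for ri, mi in zip(r, m))


def is_bijection(pi, k, q=2):
    """Check by enumeration that pi permutes all pairs of k-cell configurations."""
    configs = list(product(range(q), repeat=k))
    images = {pi(r, m) for r in configs for m in configs}
    return len(images) == len(configs) ** 2


# With the apparatus initialized to all zeros, it ends up holding a copy of r.
outcome = measure((1, 0), (0, 0))
```

The map is invertible because the state of $R$ is kept, so the addition on $R_m$ can be undone.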

One of the main goals of this paper is to formulate the following open problem:

Question 1 (existence of physically universal CA)

Is there a classical CA that is physically universal in the sense of Definition 4?

It is easy to see that non-bijective maps $\pi$ can be implemented by restricting
bijections to smaller regions. For this reason, the bijectivity assumption in Definition 4 is
irrelevant and it is a matter of taste whether one wants to keep it in the definition.
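
One standard way to see this, which is not spelled out in the text, is sketched in Python below. It assumes an additional ancilla region $R_a$ of the same size as $R$ that is initialized to all zeros as part of the environment; the names `embed`, `reset` and `and_map` are hypothetical. Any map $f$ on $A^R$ is extended to the bijection $(c, a) \mapsto (a + f(c), c)$ on $A^R \times A^{R_a}$, whose restriction to $R$ equals $f$ when $a = 0$.

```python
from itertools import product


def embed(f, q=2):
    """Turn any map f on A^R into a bijection pi on A^R x A^{R_a}.

    pi(c, a) = (a + f(c) mod q, c): the ancilla receives the old state of R,
    so no information is lost and pi is invertible.
    """
    def pi(c, a):
        return tuple((ai + fi) % q for ai, fi in zip(a, f(c))), tuple(c)
    return pi


def reset(c):
    """A non-bijective map on A^R: every configuration is sent to all zeros."""
    return (0,) * len(c)


def and_map(c):
    """Another non-bijective map on two cells: (c0, c1) -> (c0 AND c1, 0)."""
    return (c[0] & c[1], 0)


configs = list(product(range(2), repeat=2))
zero = (0, 0)
results = {}
for f in (reset, and_map):
    pi = embed(f)
    images = {pi(c, a) for c in configs for a in configs}
    results[f.__name__] = (
        len(images) == len(configs) ** 2,                 # pi is a bijection
        all(pi(c, zero)[0] == f(c) for c in configs),     # restriction to R gives f
    )
```

For `reset` the extended bijection is just the swap of $R$ and $R_a$.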

In case the answer to this question is negative, one should try to find a weaker sense 
of universal controllability. An affirmative answer, on the other hand, raises further 
questions since physically universal CAs are good candidates for studying thermody- 
namic cost of computation and (quantum) control from a new perspective. Some ideas 
on that will be presented in Subsection 2.4.

We will not formulate any conjecture regarding the solution of Question 1, but
Subsection 2.3 will show that the controllability of the controller of a system imposes
limitations on the controllability of the system itself.



2.2 Some relations between physical universality and ergodic properties

We want to discuss relations between physical universality and ergodicity of dynamical 
systems. To this end, we introduce the following terminology [17]:

Definition 5 (measure-preserving dynamical systems (MDS)) 

Let $(\Omega, \Sigma, \mu)$ be a measure space where $\Omega$ is a set, $\Sigma$ the $\sigma$-algebra of measurable subsets
of $\Omega$, and $\mu$ a measure with $\mu(\Omega) < \infty$. Let $\phi : \Omega \to \Omega$ be a measurable map with
$\mu(\phi^{-1}(B)) = \mu(B)$ for every measurable set $B$. Then $(\Omega, \Sigma, \mu, \phi)$ is called a
measure-preserving dynamical system (MDS).

Then we have: 



Lemma 1 (CA is an MDS) 

Every reversible CA as defined above is a measure-preserving dynamical system where
$\Omega := S$, $\Sigma$ is generated by the set of cylinder sets, $\mu$ is the product of uniform probability
distributions on $A$, and $\phi := \alpha_1$.



Proof: $\mu(\alpha_1^{-1}(B)) = \mu(B)$ can easily be checked for every cylinder set $B$. Since the
latter ones generate the entire sigma algebra of measurable sets, conservation of measure
follows. □
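
The check on cylinder sets mentioned in the proof can be imitated numerically. The Python sketch below assumes a finite ring of $n$ cells as a stand-in for the infinite lattice, so that $\mu$ becomes the uniform distribution on $A^n$; all names are hypothetical. It compares the reversible shift with a non-reversible rule (cellwise AND with the right neighbor), which is included only as a contrast and is not a CA in the sense used in this paper.

```python
from fractions import Fraction
from itertools import product


def shift_step(state):
    """Reversible rule: cell y takes the symbol of cell y - 1 (ring)."""
    return state[-1:] + state[:-1]


def and_step(state):
    """Non-reversible rule for comparison: cell y becomes s[y] AND s[y+1]."""
    n = len(state)
    return tuple(state[y] & state[(y + 1) % n] for y in range(n))


def cylinder_measure(cylinder, q=2):
    """mu(B) for a cylinder set B that fixes the symbols of len(cylinder) cells."""
    return Fraction(1, q ** len(cylinder))


def preimage_measure(step, n, cylinder, q=2):
    """mu(step^{-1}(B)), obtained by enumerating all q**n states of the ring."""
    states = list(product(range(q), repeat=n))
    hits = sum(
        all(step(s)[y] == symbol for y, symbol in cylinder.items())
        for s in states
    )
    return Fraction(hits, len(states))


B = {0: 1, 2: 0}                                        # cell 0 is 1, cell 2 is 0
shift_ok = preimage_measure(shift_step, 5, B) == cylinder_measure(B)
and_ok = preimage_measure(and_step, 5, {0: 1}) == cylinder_measure({0: 1})
```

Exact fractions are used so that the comparison of measures does not depend on floating-point rounding.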

The following terminology will be useful [18][17]:

Definition 6 (ergodicity)

An MDS is called ergodic if $\phi^{-1}(B) = B$ implies $B = \Omega$ or $B = \emptyset$ for all $B \in \Sigma$, where
= denotes equality up to sets of measure zero. Equivalently, $\phi^{-1}(B) = B$ can also be
replaced with $\phi^{-1}(B) \subseteq B$ or $\phi^{-1}(B) \supseteq B$ (up to sets of measure zero).

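We will also need another equivalent formulation of ergodicity [18]:

Lemma 2 (different characterization of ergodicity)

An MDS is ergodic if and only if for every $B, D \in \Sigma$ of positive measure there is a $t \in \mathbb{N}$ such that

$$\phi^{-t}(B) \cap D \neq \emptyset ,$$

where $\neq$ is again understood up to sets of measure zero.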

Then we have: 

Theorem 1 (state preparation in ergodic CAs) 

If a CA is an ergodic MDS, it allows for conditional state preparation in the sense of
Definition 2.

Proof: Let $B_i \subset S$ and $B_f \subset S$ be the cylinder sets corresponding to the initial and
the final configuration $c_i$ and $c_f$ of $R$, respectively. Then there is a $t$ such that

$$\alpha_t^{-1}(B_f) \cap B_i \neq \emptyset .$$

Choose $c \in \alpha_t^{-1}(B_f) \cap B_i$. Since $c$ is an element of $B_i$, it is of the form $c = (e, c_i)$. On
the other hand, $\alpha_t(e, c_i)|_R = c_f$ because $c \in \alpha_t^{-1}(B_f)$. □

Ergodicity of CAs has already been studied in the literature^ [20][21], but the fact
that the Bernoulli shift (Example 1) is ergodic [TT] shows that even ergodicity does not
imply physical universality in the sense of Definition 4.

2.3 Limits of controllability 

Being able to prepare a certain state, one may also wish to keep it at least for some 
time. In the context of quantum information processing, for instance, it is considered 
as an important problem to prevent a quantum state from decaying too quickly (where 
decay can be understood in the sense of either decoherence or relaxation). To ensure this,
one tries to isolate the system as much as possible from influences of the environment. 
On the other hand, implementing control operations requires interactions with the 
environment. We expect that this conflict between protecting the state by isolating 
the system and nevertheless still being able to access it, can be nicely explored in the 
setting of physically universal CAs. There, isolating the system only means preparing
the environment in a state that effectively turns off the interaction. The question
of whether this conflict implies serious restrictions on physical universality will mainly
remain unanswered, but we mention some small observations that may suggest a future
direction for research. The following statement, for instance, is almost obvious, but 
we phrase it as a theorem because it shows that too strong controllability assumptions 
are self-contradictory: 

^Note that [TO] studies ergodic quantum CAs, but not in the sense of MDS. Instead, ergodicity is meant 
in the sense of a topological dynamics having a unique invariant state. 
Theorem 2 (some configurations are unstable)

Let $R$ be a region that includes at least the Moore neighborhood of one cell $x$. Let $\alpha_t$ be
physically universal in the sense of Definition 4. Then there is a configuration $c \in A^R$
such that

$$\alpha_1(e, c)|_R \neq c \quad \forall e \in A^{L \setminus R} .$$

Proof: If the dynamics of the CA is non-trivial (which is certainly the case for physically
universal CAs), there must be a configuration $c \in A^R$ such that

$$\alpha_1(c)|_x \neq c|_x .$$

Since the state of $x$ after one time step depends only on its Moore neighborhood, which
is contained in $R$, we obtain

$$\alpha_1(e, c)|_x \neq c|_x$$

for all $e \in A^{L \setminus R}$. □

The following result is only slightly less straightforward, but it already illustrates how 
controllability of the controller of a region R restricts the controllability of R: 

Theorem 3 (no configuration lasts forever) 

Given a CA that is physically universal in the sense of Definition 4, it is impossible
that there exist initial and final configurations $c_i, c_f \in A^R$, a finite "program" region
$R_p$ with initialization $c_p$, and a time $t_0 \in \mathbb{N}$ such that

$$\alpha_t(c_i, c_p)|_R = c_f \quad \forall t \ge t_0 . \qquad (1)$$

Proof: Assume that (1) is satisfied. Set $R' := R \cup R_p$ and choose a vector $x \in L$ such
that $(R' + x) \cap R' = \emptyset$ and that $\|x\|_\infty > t_0$. Let $\beta$ be the transformation on $R' \cup (R' + x)$
that swaps the state between $R'$ and $R' + x$. By physical universality in the sense of
Definition 4, there is a configuration of the complement of $R' \cup (R' + x)$ such that $\alpha_{t_1}$
implements $\beta$ for some $t_1$. After the implementation of $\beta$, the region $R$ is in the
state $c_f$ only if the initial state of the region $R + x$ was the shifted copy of $c_f$. Hence,
$t_1$ must be smaller than $t_0$, since (1) states that the state of $R$ is $c_f$ regardless of the
state of $R' + x$ (note that $R' + x$ is part of the complement of $R_p$ by assumption and
its state is thus irrelevant for (1)). On the other hand, the implementation of the swap
$\beta$ requires at least the time $t_0$ since information can propagate only one cell per time
step, which leads to a contradiction. □

Theorem 3 shows that initializing a finite region can never prepare a state that lasts
forever. If possible at all, it requires an infinite region. Showing more powerful results
about control tasks that are self-contradictory has to be left to the future (in this
context it may also be worth mentioning Ref. [22], which describes some impossibility
results for inference tasks instead of control tasks within a computation model of the
world and relates them to the Halting problem).
2.4 Space and energy requirements of computations and control operations

In this section we want to mention some potential implications for the resource re- 
quirements of computation processes, given that physically universal CAs define a 
reasonable model of the world. Even though we have proved only a few results on 
this, the following high-level arguments motivate why physically universal CAs shed a 
different light on thermodynamics. 

1. The thermodynamic cost of isolating systems: the difficulty of isolating physical
objects from their environment is one of the main obstacles in controlling
microphysics. In usual quantum control, this appears more or less as a practical
problem and the question is how to turn off the disturbing interactions. Physical
universality, however, implies that the system is never isolated and that only
appropriate states of the environment ensure that the system behaves for some time
period as if it were isolated. The fact that, in turn, the environment of
the system is also permanently coupled to its own environment (by physical universality)
implies that this "isolating state" is perturbed after a while. Preparing the
environment in a state that effectively isolates the system for a long time probably
requires a lot of thermodynamic resources. To discuss these costs, one probably
needs a model where all interactions are permanently present and cannot be
turned on and off by the experimentalist. Within the framework of physically
universal CAs it is not only possible to address the requirements of extracting heat
from a system [23] but also of preventing the heat from reentering the system.

2. Thermodynamic reversibility: It is commonly assumed that the implementation 
of a bijective transformation of the states of a microscopic system is thermody- 
namically reversible. The fact that the experimental setup controlling the imple- 
mentation generates a lot of heat is usually considered as a problem of current 
technology rather than being a fundamental law of physics. Physically universal 
CAs provide a model that makes it possible to explore how the controller (i.e., 
the region $R_p$ around the region $R$ to be controlled) changes its state during this
implementation. From the point of view of traditional thermodynamics, this state
transition is again reversible if it is a bijection of the state space of a microsystem.
However, inverting this bijection will then change the state of the environment
around $R_p$. Then, the question of thermodynamic reversibility leads, again, to
our infinite sequence of meta-controllers. We will not present any solution to this
deep problem. We only emphasize that the existence of thermodynamically reversible
processes is challenged by the ideas above.

3. Space and energy requirements of computation: In complexity theory, the space
requirement of a computation is defined as the number of tape cells of a
Turing machine that are used during the computation process. The complexity
class PSPACE, for instance, is defined as the class of problems whose
space requirements increase only polynomially in the size of the input string [24].
It is known [25] that appropriate CAs can simulate a universal Turing machine
efficiently with respect to both space and time resources.
In our context, we want to redefine the space requirements of a computation in a
way that is motivated by ideas from thermodynamics: we do not only count those
cells of the CA that are actively involved in the computation in the sense that 
their state changes during the process. Instead, we count all cells whose state 
matters. In the simplest case, it may be necessary to set a large set of cells to 
some fixed symbol (e.g. to zero) to avoid that these cells disturb the computation 
by influencing the cells involved in the computation. From a purely computer 
scientific point of view, it is natural to study the resources of computation within 
a setting where all the sites are set to zero except for those involved in the 
computation. In our physical model, however, this would correspond to cooling all 
cells down to zero temperature, which requires infinite thermodynamic resources. 
We assume that we can only extract the entropy of a finite region and use this 
free memory space for the computation. In a physically universal CA, we then 
get the problem that this region can never remain free of entropy because the 
interaction that guarantees universality necessarily transfers entropy into the free 
memory space. 

The discussion below tries to support the vague statements above by formal ar- 
guments. We will not always distinguish between computation processes and other 
control processes.^ The following theorem is actually a simple observation, but we
phrase it as a theorem because it confirms the last sentence of item 3 above: 

Theorem 4 (lower bound on entropy influx) 

Let $R$ be an arbitrary region and $\nu$ be a probability distribution on $S$ whose restriction
to $L \setminus R$ is the uniform distribution. Let the CA be universal in the sense of Definition 4
and $x$ be some vector such that $R \cap (R + x) = \emptyset$. If $R_p$ denotes a region such that for
some $c_p \in A^{R_p}$ the state $c$ of $R$ is transferred to $R + x$, i.e.,

$$\alpha_t(c_p, c)|_{R + x} = c \quad \forall c \in A^R ,$$

for some appropriate $t$, then the entropy of $R$ after the time $t$ is at least

$$S((\nu \circ \alpha_t)|_R) \ge \frac{|R|}{|A|^{|R_p|}} \log |A| .$$

Proof: For $\nu_t := \nu \circ \alpha_t$ we consider the conditional distribution given $\alpha_t^{-1}(c_p)$. Its
restriction to $R$ is the uniform distribution because the initial state $c_p$ triggers the
implementation of the swap between $R + x$ and $R$. The entropy of the uniform
distribution on $R$ reads $|R| \log |A|$. Since $R_p$ is initially also in the maximum entropy
mixture, the probability for being in the state $c_p$ is $|A|^{-|R_p|}$. Weighting the entropy
$|R| \log |A|$ with this factor yields the desired bound, since the Shannon entropy is concave
and the entropy of a mixture is therefore at least the weighted entropy of each of its components. □
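
To get a feeling for the size of this bound, the following Python sketch evaluates it and checks the weighting step of the proof on a toy mixture. The function names are hypothetical, logarithms are taken to base 2 by assumption (the theorem leaves the base open), and the toy mixture (uniform with weight $|A|^{-|R_p|}$, a single fixed configuration otherwise) is merely one example of a distribution on $R$ for which the bound must hold.

```python
import math


def shannon_entropy(p, base=2):
    """Shannon entropy of a probability vector p."""
    return -sum(x * math.log(x, base) for x in p if x > 0)


def entropy_influx_bound(region_size, program_size, alphabet_size, base=2):
    """Right-hand side of the bound: |R| log|A| / |A|**|R_p|."""
    return region_size * math.log(alphabet_size, base) / alphabet_size ** program_size


def mixture_with_uniform(weight, n_outcomes):
    """weight * uniform distribution + (1 - weight) * point mass on outcome 0."""
    p = [weight / n_outcomes] * n_outcomes
    p[0] += 1.0 - weight
    return p


# |R| = 2 binary cells (4 configurations), |R_p| = 3 program cells.
bound = entropy_influx_bound(region_size=2, program_size=3, alphabet_size=2)
toy = mixture_with_uniform(weight=2 ** -3, n_outcomes=2 ** 2)
toy_entropy = shannon_entropy(toy)
```

The bound decays exponentially in $|R_p|$, so it is informative only when the program region is small.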

The theorem shows a trade-off between being able to implement bijections and being 
able to isolate a region: if /3 can be easily implemented on RU[R+x) (i.e., by initializing 
a small region Rp) then i? U (i? + x) is badly isolated because we get large entropy 



^On the elementary level of nature, thermodynamic and computation processes are closely related, anyway 



11 



influx. Note that no such statement holds for computationally universal CAs since 
they could have a "death state" that remains forever and turns off all interactions with 
the surrounding cells. A boundary with dead cells could then prevent the memory 
space from getting entropy from its environment. In a physically universal CA, the 
environment is always able to "revitalize" the "dead cells". It is possible that in 
physically universal CAs, the region that needs to be initialized to enable a computation 
process grows proportionally with the computation time. Loosely speaking, the size of 
the region that needs to be initialized is related to the amount of free energy that must 
be available in order to run the computation properly. This is because Landauer's
principle [27, 23, 28] states that initializing one bit requires the energy $E = kT \ln 2$.
From a more accurate point of view, however, we have to account for the fact
that the region that we must initialize need not be prepared in one
specific configuration. Instead, it could be that there is a whole set of configurations 
that ensure that the desired computation process works properly. This corresponds to 
a smaller amount of free energy. The following definition formalizes the free energy 
content of configurations: 

Definition 7 (free energy of a set of configurations)

Let $B \subset S$ be a set of configurations and $\mu$ be the uniform distribution on $S$ (which
is defined via the product of uniform distributions on each $A$). Then

$$F(B) = -\log_2 \mu(B)$$

is the free energy required to ensure that the world is in a state $s \in B$.

The definition is motivated by the following interpretation of probability distributions.
The mixed state $\mu$, which is the uniform distribution over all configurations, is thought
of as the thermodynamic equilibrium of the world, i.e., the analog of the Gibbs state.
We define its free energy to be zero. In physics, the free energy of a mixed state is,
up to the factor $kT$, given by its relative entropy distance from thermal equilibrium
[29]. Here, mixed states are probability distributions on $A^L$ and the free energy is thus
(up to constants that we ignore for the sake of convenience) given by the relative entropy
distance from $\mu$, i.e.,

$$F(\tilde{\mu}) := D(\tilde{\mu} \,\|\, \mu) .$$

If $\tilde{\mu}$ is any distribution with support $B$, the relative entropy distance to $\mu$ is minimal
if $\tilde{\mu}$ is the uniform distribution on $B$. One checks easily that in this case

$$D(\tilde{\mu} \,\|\, \mu) = -\log_2 \mu(B) .$$
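
The last identity can be checked numerically on a small example. The following is a minimal sketch, not part of the original argument: it assumes a toy "lattice" of four binary cells, takes $B$ to be the cylinder set in which the first two cells are zero, and compares the relative entropy with $-\log_2 \mu(B)$. All names (`relative_entropy_bits`, `n_cells`, `skewed`) are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product
from math import log2

def relative_entropy_bits(p, q):
    """D(p || q) in bits; p and q are dicts mapping states to probabilities."""
    return sum(float(pv) * log2(pv / q[s]) for s, pv in p.items() if pv > 0)

alphabet = (0, 1)   # assumed toy alphabet A
n_cells = 4         # assumed toy lattice size (the paper's lattice is infinite)
states = list(product(alphabet, repeat=n_cells))
mu = {s: Fraction(1, len(states)) for s in states}  # uniform measure

# B: cylinder set of all states whose first two cells are zero
B = [s for s in states if s[0] == 0 and s[1] == 0]
mu_B = sum(mu[s] for s in B)

uniform_on_B = {s: Fraction(1, len(B)) for s in B}
free_energy = -log2(mu_B)                               # F(B)
divergence = relative_entropy_bits(uniform_on_B, mu)    # D(uniform on B || mu)

# A non-uniform distribution with the same support B has a larger divergence.
skewed = {B[0]: Fraction(1, 2), B[1]: Fraction(1, 6),
          B[2]: Fraction(1, 6), B[3]: Fraction(1, 6)}
divergence_skewed = relative_entropy_bits(skewed, mu)

print(free_energy, divergence, divergence_skewed)
```

In this toy case both `free_energy` and `divergence` equal 2 bits, which is the number of cells that the set $B$ fixes.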

Within this setting, we can easily define the free energy needed for a preparation 
process: 

Definition 8 (free energy required for a preparation process)

Assume a region $R$ is in the state $c_i \in A^R$ and we want it to be in the state $c_f$ at time $t$.
Interpreting $c_i$ and $c_f$ as cylinder sets, the state of the lattice $s \in A^L$ must be chosen
such that

$$s \in c_i \cap \alpha_t^{-1}(c_f) ,$$

where the right hand side interprets $c_i$ and $c_f$ as sets (as defined previously).
Then,

$$F(c_i \mapsto c_f) := -\log_2 \mu\big(c_i \cap \alpha_t^{-1}(c_f)\big)$$

is the free energy needed to implement the preparation of $c_f$ after the time $t$.

Note that this definition includes the free energy content of $c_i$, which is given by
$|R| \log_2 |A|$, since $c_i$ is one configuration in a set of $|A|^{|R|}$ possible ones.
We also define the free energy required for a computation process: 

Definition 9 (energy requirements for computation)

Assume that the physically universal CA is only able to perform a desired computation
process $C$ if the state $s$ of the world lies in the set $B \subset A^L$. Then

$$F(B) := -\log_2 \mu(B)$$

is the free energy required for $C$.

We will not elaborate on this any further, but consider the thermodynamic costs of
implementing sequences of state transitions on some region $R$, since this task is easier to
address than computation tasks. Consider the following sequence of state transitions

$$c_0 \overset{t_1}{\mapsto} c_1 \overset{t_2}{\mapsto} c_2 \cdots \overset{t_n}{\mapsto} c_n ,$$

and define the corresponding free energy resource requirements by

$$-\log_2 \mu\big(c_0 \cap \alpha_{t_1}^{-1}(c_1) \cap \alpha_{t_1+t_2}^{-1}(c_2) \cap \cdots \cap \alpha_{t_1+\cdots+t_n}^{-1}(c_n)\big) .$$
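
To make this quantity concrete, the following minimal sketch evaluates it by brute force on a toy system. It is only an illustration under strong assumptions: the lattice is a ring of `N = 8` binary cells instead of an infinite lattice, and `step` is a hypothetical reversible block rule chosen for simplicity; nothing suggests that this rule is physically universal.

```python
from itertools import product
from math import log2

N = 8  # assumed size of the toy ring lattice (must be even)

def step(state):
    """One time step of an assumed reversible block rule (a, b) -> (a, a XOR b),
    applied first to the pairs (0,1), (2,3), ... and then to (1,2), ..., (N-1,0)."""
    s = list(state)
    for offset in (0, 1):
        for i in range(offset, N, 2):
            s[(i + 1) % N] ^= s[i]
    return tuple(s)

def restrict(state, region):
    """Restriction of a lattice state to the cells listed in `region`."""
    return tuple(state[i] for i in region)

def free_energy(region, configs, times):
    """-log2 mu(c_0 ∩ alpha_{t_1}^{-1}(c_1) ∩ ...) for the uniform measure mu,
    obtained by counting the lattice states that realize the whole sequence.
    configs[0] is c_0; configs[m] must hold after times[0] + ... + times[m-1] steps."""
    count = 0
    for state in product((0, 1), repeat=N):
        ok = restrict(state, region) == configs[0]
        for target, t in zip(configs[1:], times):
            if not ok:
                break
            for _ in range(t):
                state = step(state)
            ok = restrict(state, region) == target
        count += ok
    return float("inf") if count == 0 else -log2(count / 2 ** N)

R = (0, 1)
print(free_energy(R, [(0, 0)], []))            # 2.0: only c_0 is fixed
print(free_energy(R, [(0, 0), (1, 0)], [1]))   # 3.0 for this toy rule
```

The first value is just $|R| \log_2 |A|$, the free energy content of the initial configuration. An infinite value means that no state of the toy lattice realizes the requested sequence.
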
An interesting special instance is to implement $k$ cycles

$$\underbrace{c_1 \mapsto c_2 \mapsto \cdots \mapsto c_n}_{\text{1st cycle}} \mapsto \underbrace{c_1 \mapsto c_2 \mapsto \cdots \mapsto c_n}_{\text{2nd cycle}} \mapsto \cdots , \qquad (2)$$

where the transition from $c_j$ to $c_{j+1}$ and from $c_n$ to $c_1$ is implemented by one time
step of the CA. We do not know whether physically universal CAs also allow for the
implementation of arbitrarily many cycles of this form, but given that they do, we have
the following statement for ergodic CAs:

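Theorem 5 (cost of implementing repeated cycle processes)

Let $c_1, \dots, c_n$ be configurations of a region $R$ such that (interpreted as cylinder sets)

$$\bigcup_{i=1}^{n} c_i \neq A^L . \qquad (3)$$

Then, in an ergodic CA, the cost of implementing $k$ cycles of the form (2) converges
to infinity for $k \to \infty$.

Proof: Define $B := \bigcup_{j=1}^{n} c_j$ and

$$D := \bigcap_{k \in \mathbb{N}_0} \alpha_k^{-1}(B) .$$

Clearly,

$$\alpha_1(D) \subset D . \qquad (4)$$

Due to eq. (3), we have $\mu(D) \neq 1$. Since $\alpha_1$ is ergodic, all sets satisfying the invariance
condition (4) have measure zero or one, hence $\mu(D) = 0$. Due to

$$\lim_{k \to \infty} \mu\Big(\bigcap_{j=0}^{k} \alpha_j^{-1}(B)\Big) = 0 ,$$

the statement follows, because implementing $k$ cycles requires the state of the lattice
to lie in $\bigcap_{j=0}^{kn-1} \alpha_j^{-1}(B)$. □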

A weaker task than implementing a cycle is to periodically restore the same configuration
$c$ again and again after $\tau$ time steps, without specifying what happens in between:

$$c \overset{\tau}{\mapsto} c \overset{\tau}{\mapsto} c \overset{\tau}{\mapsto} \cdots$$

According to Definition 7, the free energy requirements are given by

$$-\log_2 \mu\Big(\bigcap_{j=0}^{n} \alpha_{j\tau}^{-1}(c)\Big) . \qquad (5)$$

To derive statements on the resources needed, we first recall the following mixing 
property (see [TH], page 38), which is known to imply ergodicity [T7]: 

Definition 10 (weakly mixing MDS)

An MDS is called weakly mixing if

$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} \big| \mu(\varphi^{-j}(B) \cap D) - \mu(B)\,\mu(D) \big| = 0 ,$$

for all measurable sets $B, D$.

The following result of ergodic theory (Corollary 14.15 in [T7]) will be helpful: 

Lemma 3 (mixing of all orders)

Every weakly mixing MDS is weakly mixing of all orders in the sense that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} \mu\big(B_0 \cap \varphi^{-m}(B_1) \cap \varphi^{-2m}(B_2) \cap \cdots \cap \varphi^{-(k-1)m}(B_{k-1})\big) = \mu(B_0) \cdots \mu(B_{k-1}) , \qquad (6)$$

for all $k$ and all measurable sets $B_0, \dots, B_{k-1}$.
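We apply this result to our setting and obtain:

Theorem 6 (cost of restoring states in weakly mixing CAs)

Let the CA be weakly mixing and assume that there is a configuration $c \in A^R$ for
which it is possible to implement the following $k$-fold recurrence

$$\underbrace{c \overset{\tau}{\mapsto} c \overset{\tau}{\mapsto} c \overset{\tau}{\mapsto} \cdots \overset{\tau}{\mapsto} c}_{k}$$

for all $\tau > \tau_0$ for some $\tau_0 \in \mathbb{N}$. Let $F_k(\tau)$ be the free energy required for implementing
this process. Define the average free energy requirements over all $\tau > \tau_0$ by

$$F_k := \liminf_{\tau_1 \to \infty} \frac{1}{\tau_1 - \tau_0 + 1} \sum_{\tau=\tau_0}^{\tau_1} F_k(\tau) . \qquad (7)$$

Then it satisfies the lower bound

$$F_k \geq -k \log_2 \mu(c) .$$

Proof: According to Definition 7, $F_k(\tau)$ reads

$$F_k(\tau) = -\log_2 \mu\Big(\bigcap_{j=0}^{k-1} \alpha_{j\tau}^{-1}(c)\Big) .$$

Hence,

$$F_k = \liminf_{\tau_1 \to \infty} \frac{1}{\tau_1 - \tau_0 + 1} \sum_{\tau=\tau_0}^{\tau_1} \Big( -\log_2 \mu\Big(\bigcap_{j=0}^{k-1} \alpha_{j\tau}^{-1}(c)\Big) \Big) .$$

The concavity of the logarithm (Jensen's inequality) implies

$$F_k \geq -\log_2 \lim_{\tau_1 \to \infty} \frac{1}{\tau_1 - \tau_0 + 1} \sum_{\tau=\tau_0}^{\tau_1} \mu\Big(\bigcap_{j=0}^{k-1} \alpha_{j\tau}^{-1}(c)\Big) = -\log_2 \mu(c)^k = -k \log_2 \mu(c) ,$$

where the second last equality uses eq. (6).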
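
The only inequality in this proof is the Jensen step: the average of $-\log_2 \mu_\tau$ is at least $-\log_2$ of the average of $\mu_\tau$. The following minimal sketch checks this step numerically. The values in `measures` are random numbers standing in for the measures $\mu(\bigcap_j \alpha_{j\tau}^{-1}(c))$; they are an assumption made for illustration and are not computed from any CA.

```python
import random
from math import log2

random.seed(0)

# Hypothetical stand-ins for the measures mu_tau, one value per tau.
measures = [random.uniform(0.01, 1.0) for _ in range(1000)]

# Average free energy: mean of -log2(mu_tau).
avg_free_energy = sum(-log2(m) for m in measures) / len(measures)

# Free energy of the averaged measure: -log2 of the mean of mu_tau.
free_energy_of_avg = -log2(sum(measures) / len(measures))

print(avg_free_energy >= free_energy_of_avg)  # Jensen's inequality
```

Equality would hold only if all values in `measures` were identical.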

Theorem 6 states that the cost of repeatedly restoring the same state $k$ times (after
$\tau$ time steps) grows at least linearly in $k$ when averaged over all $\tau$. The physical relevance
of this statement is speculative for two reasons. First, we do not know whether the
appropriate mixing properties follow from physical universality. Second, it is unclear
whether the assumption that the sequence of state transitions can be implemented for
all $\tau > \tau_0$ is reasonable. We therefore formulate another open problem:

Question 2 (thermodynamic cost of cycles)

Given any desired configuration $c$, how does the free energy $F_k$ of restoring it again
and again grow with the number $k$ of cycles?

If the energy grows at least linearly in $k$ for physically realistic models, this
would suggest that implementing cycles on microscopic systems involves an experimental
setup whose energy content grows linearly in the number of cycles. On the
one hand, the energy content does not seem to be used up, since it just needs to be
available. On the other hand, this amount of energy cannot be used to implement the
next cycles because, if reusing the energy were possible, the amount of energy that needs
to be present would not grow linearly in $k$. Note that the above framework, which is based
on ergodic theory, avoids exploring the thermodynamic cost of an infinite sequence of
controllers and meta-controllers as sketched in item 2 at the beginning of this subsection
because the universal CA describes the whole hierarchy of controllers simultaneously.

2.5 Towards a physical analog of Kolmogorov complexity
and Solomonoff's prior

Several authors have already pointed out the physical relevance of algorithmic information
("Kolmogorov complexity"), e.g., [30, 31]. For any binary string $s \in \{0,1\}^*$, the
algorithmic information $K(s)$ is defined as the length of the shortest program on a
universal prefix Turing machine that outputs $s$ and then halts [32, 33, 34].

The thermodynamic relevance of Kolmogorov complexity has, for instance, been
emphasized in [35, 30, 36]; its importance for statistical inference has already been
described by Solomonoff [33]; and the foundations of modern machine learning methodology
also often refer to Kolmogorov complexity, e.g., [37, 38]. Recently, [39, 40, 41] postulated
causal inference rules that also use algorithmic information. A crucial concept
for algorithmic information based inference is Solomonoff's prior:

Definition 11 (Solomonoff's prior)

Let $T$ be a universal Turing machine with prefix coding. Then, for any binary string
$s \in \{0, 1\}^*$, one defines $m(s)$ as the probability that $T$ produces the output $s$ and stops
after every bit of the infinite input tape has been randomly set to 0 or 1 with probability
1/2 each.

Note that these random programs do not contain any additional symbol that indi- 
cates the end of the program code. Since the Turing machine uses prefix coding, no 
valid program is the prefix of another one. For this reason, the uniform distribution 
over all binary words (defined by the infinite product of uniform distributions on {0, 1}) 
automatically defines a distribution on the set of valid programs. 
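
The role of prefix coding can be illustrated with a small finite example. The following minimal sketch replaces the universal machine $T$ by a hypothetical lookup table `TOY_PROGRAMS` of four prefix-free programs; this table is an assumption made for illustration and is of course not universal. A random infinite tape starts with the program $p$ with probability $2^{-|p|}$, so the prior of an output is the sum of these weights over all programs that produce it.

```python
from fractions import Fraction

# Hypothetical toy machine: a prefix-free table mapping programs to outputs.
TOY_PROGRAMS = {
    "0": "0000",
    "10": "0101",
    "110": "0000",
    "1110": "1011",
}

def is_prefix_free(programs):
    """True if no program is a prefix of another one.
    After sorting, a prefix is always directly followed by one of its extensions."""
    words = sorted(programs)
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

def toy_prior(programs):
    """m(s) = sum of 2^(-|p|) over all programs p whose output is s."""
    m = {}
    for program, output in programs.items():
        m[output] = m.get(output, Fraction(0)) + Fraction(1, 2 ** len(program))
    return m

m = toy_prior(TOY_PROGRAMS)
total = sum(m.values())
print(m["0000"], total)  # 5/8 and 15/16
```

The total stays below one because tapes beginning with `1111` match no program in the table, which corresponds to inputs on which the machine does not halt.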

Even though Solomonoff's prior has proven to be a powerful concept for the foundation
of inference, the following modifications may be appropriate for a prior on the
states of the physical world:

1. Symmetries: What prior probability should, for instance, be assigned to the event
that the next lightning strike hits the earth at a longitude of 0° (up to an error of $\epsilon$)?
There is no reason why it should be larger than the probability of hitting the
earth at 24.35219°, because nature does not care about whether the numerical
value of the location can be computed by a short program. The physical laws
that govern lightning fulfill some symmetries that should be respected by our
prior. To construct a prior that accounts for these symmetries and still captures
the aspect of description length, we propose to use a computation model that is
inherently symmetric with respect to some transformations.






2. Complexity of isolating systems: According to Solomonoff's prior, any state having
a short program as description is likely to occur in nature, no matter whether
the running time is large or not and no matter how robust the output is with
respect to perturbing the state of the Turing machine during the computation.
A physical prior probability should also account for the robustness of the computation
process since no system is perfectly isolated from its environment. Physically
universal CAs are good models to take this into account because the coupling between
a system and its environment is always present by definition.

We now define a prior via a physically universal CA. A naive analog of randomizing
the input of the Turing machine would be to initialize the CA to the uniform distribution
over all pure states and then to apply the dynamics $\alpha_t$. However, this yields a trivial prior
for every $t$ since our bijective dynamics preserves the uniform distribution. We want
to define a prior that gives higher probability to simple patterns like $0^R$ (all cells in $R$
are in the state 0). It will therefore be based on the following initial state:

Definition 12 (initial state of the universe)

Let $L = L_+ \cup L_-$ be a partition of the lattice into two infinite subsets ($L_+$ could, for
instance, be all cells with $x_1 > 0$). Define a probability measure $Q$ by setting all sites
in $L_-$ to zero and choosing the uniform distribution on $L_+$ (i.e., for every site in $L_+$,
a symbol is chosen independently with probability $1/|A|$ each).

We consider $L_+$ and $L_-$ as hot and cold parts of the world, respectively. Then, interesting
structure can only start growing at the boundary between hot and cold regions.
This accounts for the fact that life requires thermal non-equilibrium, which is most
naturally provided by temperature gradients.

Such a state ensures the availability of an infinite amount of free memory space.
A similar convention would also be required for Solomonoff's prior if it were defined
with respect to a reversible Turing machine [42]. Then one would also need to provide
free memory space for free in order to ensure that the string $0^k$ obtains a higher prior
probability than a typical $k$-bit string. We now define:

Definition 13 (physical prior)

For every time $t \in \mathbb{N}$, let $P_t$ be the probability distribution on $S$ that is obtained by
applying $\alpha_t$ to the initial mixed state $Q$, as given by Definition 12.
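
The construction of $P_t$ can be sketched on a finite toy system. The following minimal example makes several assumptions that are not part of the definition above: the lattice is a ring of eight binary cells, so the cold part `COLD` touches the hot part `HOT` on both sides, and `step` is a hypothetical reversible block rule that is not claimed to be physically universal. The sketch enumerates all initial states that have zeros on the cold part, evolves them, and collects the induced distribution on a region.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N = 8                  # assumed toy ring lattice (must be even)
COLD = range(0, 4)     # stand-in for L_-: initialized to zero
HOT = range(4, 8)      # stand-in for L_+: initialized uniformly at random

def step(state):
    """One time step of an assumed reversible block rule (a, b) -> (a, a XOR b),
    applied first to the pairs (0,1), (2,3), ... and then to (1,2), ..., (N-1,0)."""
    s = list(state)
    for offset in (0, 1):
        for i in range(offset, N, 2):
            s[(i + 1) % N] ^= s[i]
    return tuple(s)

def physical_prior(region, t):
    """Distribution P_t restricted to `region`, as a dict configuration -> probability."""
    counts = Counter()
    n_initial = 0
    for hot_bits in product((0, 1), repeat=len(HOT)):
        state = (0,) * len(COLD) + hot_bits
        for _ in range(t):
            state = step(state)
        counts[tuple(state[i] for i in region)] += 1
        n_initial += 1
    return {c: Fraction(k, n_initial) for c, k in counts.items()}

print(physical_prior((0, 1), 0))   # cold cells are certainly zero at t = 0
print(physical_prior((1, 2), 3))   # distribution on two cold cells after three steps
```

At $t = 0$ a region inside the cold part is in the all-zero configuration with probability one, whereas a region inside the hot part is uniformly distributed.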

Let us discuss some properties of $P_t$. As opposed to Solomonoff's prior, it depends
on $t$. This is because the Turing machine stops for appropriate inputs whereas the
dynamics of our CA does not. It is not clear whether one should consider this as a
feature rather than as a drawback of our definition: one may argue that in the early
stage of the universe some states were more likely than today and others were less
likely. Note, however, that

$$P(c) \propto \sum_{t \in \mathbb{N}_0} P_t(c)\, 2^{-K(t)}$$

would be an option to define a time-independent prior. To elaborate on this goes
beyond the scope of this paper, but the additional term $K(t)$ will also appear in our
definition of physical complexity below.

Figure 1: The prior probability of the configuration on the left is the same as for its shifted
copy (middle) and its rotated copy (right) provided that these transformations preserve the
partition into $L_+$ and $L_-$. For instance, if $L_+$ is the set with $x_1 > 0$, then it is allowed to
shift along the $x_1 = 0$ axis. In a lattice with $d \geq 3$, there are also rotations that preserve
the axis.

A second feature of Pt is that the prior probability of a configuration (and also 
the physical complexity that we define below) depends on its location on the lattice: 
creating a cold region in the middle of the hot region involves much more sophisticated 
initialization than creating it close to the boundary to the cold region. In the former 
case, the entropy of the hot region needs to be transported over a long way to the cold 
region. 

Recalling our motivation for defining a prior different from Solomonoff's, we note
that $P_t$ indeed respects some of the symmetries of physical laws. Consider, for some
time $t$, the probability $P_t$ of the pattern in Fig. 1, left, consisting of symbols 1 and 0.
The empty squares indicate cells whose value is unspecified. Fig. 1, middle and right,
show shifted and rotated copies of the same pattern, respectively. If the shift and the
rotation are chosen such that they leave $L_\pm$ invariant, then $P_t$ is obviously the same
for these copies.

To discuss item 2 in the above list of desired modifications, we assume that the
generation of some $c$ requires only a short program on a Turing machine but that one needs
to initialize a large region $R_p$ to generate it on a physically universal CA. One reason
could be that it involves a long and fragile computation process which only outputs
the correct result if a large environment is correctly initialized. Then $P_t(c)$ would be
small for all $t$.

In the spirit of Solomonoff's prior, we would like to ensure that every $c \in A^R$ (for
an arbitrary finite region $R$) gets non-zero probability for some $t \in \mathbb{N}_0$. Note that the
maximally mixed state on $L_+$ can be interpreted as a mixture over "random programs",
and it is not clear whether programs on $L_+$ are sufficient for preparing any desired
configuration (also on $L_-$). It could be that this defines an even stronger kind of
physical universality. This problem will also be left open.

To define a physical analog of Kolmogorov complexity we first discuss why the
following straightforward definition is inappropriate for our purposes: For any $c \in A^R$
one could define the complexity of $c$ as the size of the smallest region $R_p$ for which there is a
state $c_p \in A^{R_p}$ such that

$$\exists t \in \mathbb{N}_0 : \quad \alpha_t(c_p, e)|_R = c \quad \forall e \in A^{L \setminus R_p} .$$

Then the complexity of $c$ is at least $|R|$ because $\alpha_t$ is a bijection. We want to define
the complexity of a state in such a way that simple patterns like $0^R$ have low complexity.
Unfortunately, we are not able to show that this will be the case for the complexity
measure below, but there is at least no obvious argument why it cannot be (as opposed
to the above measure). The decisive assumption that we make is that $R_p$ must be
contained in $L_+$. This is in agreement with the fact that our "random programs" that
define $P_t$ are contained in $L_+$ while $L_-$ only contains free memory space.

Even though we must leave open whether every configuration can be prepared
by programs in $L_+$ (see also the remarks above regarding the physical prior), we now
define the "program size complexity". We phrase it more generally and also define the
complexity of processes other than state preparation:

Definition 14 (physical complexity)

Let $R$ be some region and

$$M : A^R \to A^R$$

be an arbitrary map. The physical complexity of $M$ is defined by the minimum

$$C(M) := \min\{ |R_M| \log_2 |A| + K(t) \} ,$$

where the minimum is taken over all $t \in \mathbb{N}$ and all regions $R_M$ and initializations
$c_M \in A^{R_M}$ for which

$$\alpha_t(c_M, c_i)|_R = M(c_i)$$

holds for all $c_i \in A^R$.

The physical complexity $C(c)$ of a configuration $c$ is defined by the complexity of the
map $M$ with $M(c_i) = c$ for all $c_i \in A^R$.

The additional term $K(t)$ will later be needed to ensure that our complexity measure
satisfies Kraft's inequality. A more intuitive justification may be that the time parameter
must be provided as external information. Since there is probably no finite
initialization that prepares a state and keeps it forever, we must be told when the desired
state is present or the desired transformation has been performed. The following relation
between physical complexity and the physical prior is almost obvious:

Lemma 4 (lower bound on physical complexity)

$$C(c) \geq \min_{t}\{ -\log_2 P_t(c) + K(t) \} . \qquad (8)$$

Proof: By definition of the physical prior,

$$P_t(c) \geq |A|^{-|R_p|} , \qquad (9)$$

for all $c_p$ that prepare $c$ after the time $t$. By definition of physical complexity,

$$C(c) = \min_{t \in \mathbb{N}_0} \min_{c_p} \{ |R_p| \log_2 |A| + K(t) \} ,$$

where the minimum is taken over all $c_p$ that prepare $c$ after the time $t$. Using (9) we
obtain

$$C(c) \geq \min_{t}\{ -\log_2 P_t(c) + K(t) \} . \qquad (10)$$

□

Rather than having only inequality (10), one may wish to show a tighter link between
the physical prior and physical complexity, in analogy to the tight connection between
Solomonoff's prior and Kolmogorov complexity [43]:

Theorem 7 (Coding Theorem of Levin)

$$-\log_2 m(s) = K(s) + O(1) ,$$

where $O(1)$ means that the error can be bounded by a constant that depends on the
Turing machine, but does not depend on $s$.

Hence, $m(s) \approx 2^{-K(s)}$ up to a multiplicative term that is bounded by some constant.
For this reason, tighter connections between the physical prior and physical complexity
are desirable.

The following theorem describes a mathematical property of physical complexity 
that it shares with Kolmogorov complexity: 

Theorem 8 (Kraft's inequality)

Let $U$ be a set of mutually exclusive configurations of arbitrary size. Then, physical
complexity satisfies

$$\sum_{c \in U} 2^{-C(c)} \leq 1 .$$

Proof: Let $t_c$ be the time that minimizes the right hand side of (10), hence

$$C(c) \geq -\log_2 P_{t_c}(c) + K(t_c) .$$

We conclude

$$\sum_{c \in U} 2^{-C(c)} \leq \sum_{c \in U} P_{t_c}(c)\, 2^{-K(t_c)} \leq \sum_{t \in \mathbb{N}_0} \sum_{c \in U} P_t(c)\, 2^{-K(t)} \leq \sum_{t \in \mathbb{N}_0} 2^{-K(t)} \leq 1 ,$$

where the second last inequality holds because the configurations are mutually exclusive
and the last step uses the usual Kraft inequality for Kolmogorov complexity. □
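
The chain of inequalities in this proof can be checked numerically with computable stand-ins. The following minimal sketch makes two assumptions that are not part of the paper: since $K(t)$ is uncomputable, the length of the Elias gamma codeword of $t$ (a prefix code, defined for $t \geq 1$) is used in its place, and the distributions $P_t$ over five mutually exclusive configurations are drawn at random instead of being derived from a CA.

```python
import random
from fractions import Fraction

def gamma_length(t):
    """Length of the Elias gamma codeword of an integer t >= 1 (a prefix code),
    used here as a computable stand-in for K(t)."""
    return 2 * (t.bit_length() - 1) + 1

def random_distribution(n):
    """A random probability distribution on n mutually exclusive outcomes."""
    weights = [random.randint(1, 9) for _ in range(n)]
    return [Fraction(w, sum(weights)) for w in weights]

random.seed(1)
T_MAX, N_CONFIGS = 16, 5

# Hypothetical priors P_t(c) for t = 1, ..., T_MAX.
P = {t: random_distribution(N_CONFIGS) for t in range(1, T_MAX + 1)}

# Kraft sum of the code lengths over the considered times.
kraft_times = sum(Fraction(1, 2 ** gamma_length(t)) for t in P)

# For each configuration, the largest term P_t(c) * 2^(-length(t)) over all t,
# which plays the role of the upper bound on 2^(-C(c)) obtained from (10).
best_terms = [
    max(P[t][c] * Fraction(1, 2 ** gamma_length(t)) for t in P)
    for c in range(N_CONFIGS)
]
total = sum(best_terms)

print(total <= kraft_times <= 1)
```

The sum over configurations is bounded by the Kraft sum over times because each $P_t$ sums to one over the mutually exclusive configurations.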

The fact that Kolmogorov complexity satisfies Kraft's inequality (which was not 
the case in Kolmogorov's version since he did not use prefix codes) made it possible to 
renormalize it to a probability distribution on strings, yielding Solomonoff's prior. 

Although a better understanding of our notion of physical complexity has to be left
to the future, it is, by construction, clear that it takes into account whether running a
process requires adjusting a large part of the environment, even though the process may
be simple from the point of view of algorithmic information. Such a strong disagreement
between Kolmogorov complexity and physical complexity occurs, e.g., if $R_p$ is large but
$c_p$ mainly consists of zeros or some other algorithmically simple pattern. If a physical
process requires, for instance, cooling a large region (e.g., setting many cells to zero)
around the system, this could formally appear as large physical complexity.

3 Physical universality in the quantum world 

3.1 Informal description of some differences to the classical case

The main question that arises when we translate the notion of universal state preparation
into the quantum world is whether the configuration of the environment is supposed
to be a basis state. In other words, we ask whether the preparation of general
quantum superpositions should be reducible to the preparation of basis states in the
environment.

On the one hand, it seems to be artificial to select a certain subset of states as 
being more fundamental than others. On the other hand, the following model suggests 
that basis states should be sufficient: we could think of the basis states as states in the 
register of a classical processor that controls a quantum preparation machine. Then 
the register is the region that we act on by changing its classical state only. 

3.2 Defining the problem 

To formally define quantum CAs, we assume that every site $x \in L$ contains a quantum
system with Hilbert space $\mathcal{H} := \mathbb{C}^a$, where $a := |A|$ and the basis vectors $|j\rangle$ are
labelled by symbols $j \in A$. The Hilbert space of a region $R$ is then given by the tensor
product of copies of $\mathcal{H}$, but to avoid problems with infinite tensor products we follow
[44] and use an operator algebraic framework [45, 46]: Let every site $x$ be described by
a copy $\mathcal{A}_x$ of the same matrix algebra of $a \times a$ matrices. The self-adjoint part of $\mathcal{A}_x$ is
interpreted as the observables corresponding to site $x$. For every finite set $\Lambda \subset L$, let
$\mathcal{A}_\Lambda$ be the tensor product

$$\mathcal{A}_\Lambda := \bigotimes_{x \in \Lambda} \mathcal{A}_x .$$

For $\Lambda \subset \Lambda'$, $\mathcal{A}_\Lambda$ is considered as a subalgebra of $\mathcal{A}_{\Lambda'}$ in a canonical way by adding the
tensor product of an appropriate number of $a \times a$ identity matrices. For every infinite
set $\Lambda$, we define $\mathcal{A}_\Lambda$ as the C*-completion of the union of the algebras of finite regions
$\Lambda' \subset \Lambda$. This defines the C*-algebra $\mathcal{A}_L$, which contains all local algebras.^

The set $S(\mathcal{A}_L)$ of states is the set of positive linear functionals $\phi : \mathcal{A}_L \to \mathbb{C}$ with
$\phi(1) = 1$. The state space $S(\mathcal{A}_L)$ is a convex set whose extreme points are called pure
states; this definition generalizes density operators of rank one to the infinite system.

^ the "quasi-local" algebra [45]

A pure state $\phi$ is said to be a basis state on a region $R$ if it is given by

$$\phi(a) = \mathrm{tr}(\rho a) \quad \forall a \in \mathcal{A}_R ,$$

where $\rho$ is a diagonal matrix with diagonal $(0, \dots, 0, 1, 0, \dots, 0)$. A pure state is said
to be a (global) basis state if its restriction to every finite region is a basis state.

It is convenient to describe the dynamics in the Heisenberg picture; it is then given
by a group $(\alpha_t)_{t \in \mathbb{Z}}$ of C*-automorphisms of $\mathcal{A}_L$ satisfying the following locality condition:

$$\alpha_1(\mathcal{A}_x) \subset \mathcal{A}_R ,$$

for every region $R$ that contains the Moore neighborhood of $x$ with radius one. The
dynamics transfers the state $\phi$ into $\phi \circ \alpha_t$. For any observable $a \in \mathcal{A}_R$ for which
$\alpha_t(a) \in \mathcal{A}_{R'}$ for some region $R'$, the value $(\phi \circ \alpha_t)(a)$ is already determined by the
restriction of $\phi$ to $R'$. Therefore, $(\rho \circ \alpha_t)(a)$ is also a well-defined expression if $\rho$ is a
state on $\mathcal{A}_{R'}$.

The following notion of physical universality can be seen as a quantum analog of
Definition 2. As opposed to the set of classical configurations of
a finite region, the set of pure states is (uncountably) infinite. On the other hand, the
set of basis states of a region $R_p$ is finite and the set of all basis states of the whole
lattice is still countable. Hence we cannot prepare all states on $R$ exactly but at most up to
any desired accuracy:

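Definition 15 (conditional quantum state preparation)

A quantum CA is said to allow for conditional state preparation if for every pair of
states $(\rho_i, \rho_f) \in S(\mathcal{A}_R) \times S(\mathcal{A}_R)$ of a region $R$ and every $\epsilon > 0$ there is a basis state
$\gamma \in S(\mathcal{A}_{L \setminus R})$ of the complement and a time $t$ such that

$$\big| (\gamma \otimes \rho_i) \circ \alpha_t(a) - \rho_f(a) \big| \leq \epsilon \|a\| \quad \forall a \in \mathcal{A}_R ,$$

where $\|\cdot\|$ denotes the operator norm.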

It is important to note that the program state $\gamma$ is a basis state, i.e., the program
is classical software. As opposed to the classical case, this notion of universality is not
satisfied by the "trivial" CA that only shifts the state. Instead, it includes problems
like how to prepare sophisticated multi-particle entanglement using a given interaction
by preparing the environment in basis states. We thus formulate the following open
problem:

Question 3 (physically universal quantum CA)

Is there a quantum CA that is physically universal in the sense of Definition 15?

We will not translate Definitions 3 and 4 to the quantum setting since it is not obvious
that even our "weak" form of universality can be achieved by quantum CAs.

3.3 Physically universal Hamiltonians

To account for the fact that time evolutions are actually continuous, we may want
to switch from CAs to Hamiltonians. In the literature there exists a large number of
translation invariant finite range Hamiltonians on lattices that are universal for quantum
computing, e.g., [47, 48, 49, 50], but physical universality has not been considered.
A characteristic feature of many constructions for computationally universal Hamiltonians
is the separation between a "program region" and a "data region" where the
former controls the operations performed on the latter. Physical universality would
imply that we are also able to operate on the program region, which could require an
infinite hierarchy of program regions. To formally define physical universality, we can
straightforwardly adapt Definition 15 by replacing the group $(\alpha_t)_{t \in \mathbb{Z}}$ with the continuous
version $(\alpha_t)_{t \in \mathbb{R}}$. To properly state what it means that the dynamics of an infinite
lattice is given by a finite range translation invariant Hamiltonian, we consider an operator
$h \in \mathcal{A}_R$ for some region $R$ and define, for every vector $x \in \mathbb{Z}^d$, the shifted copy
of $h$ by $\tau_x(h)$. Then it is known that the differential equation

$$\frac{d}{dt} \alpha_t(a) = i \sum_{x \in \mathbb{Z}^d} [\tau_x(h), \alpha_t(a)] \qquad (11)$$

uniquely defines a group of C*-automorphisms [46]. Definition 15 and, correspondingly,
Question 3 then straightforwardly translate to the group $\alpha_t$ defined by (11).

The considerations on the thermodynamic costs change more significantly because
we replace the maximum entropy state by the state of minimum free energy, i.e., the
Gibbs state (for defining thermal equilibrium states for infinite lattices see [46]), which
ensures that we are getting closer to real physics. We may then even allow for lattices
having an infinite dimensional algebra at each site. We also want to translate the
physical prior and the physical complexity of Subsection 2.5. Now, the notion of hot
and cold parts is taken more literally than above since the definition of Hamiltonians
allows us to define thermal states for temperatures other than $T = 0$ and $T = \infty$.
Thermal equilibrium states on infinite quantum lattice systems can be defined via
limits of Gibbs states for finite regions [46] (we do not care about the potential
non-uniqueness of limit points here). We restrict these states of the infinite lattice to $L_+$
and $L_-$, respectively, and "glue" them together to define our initial state:

Definition 16 (initial state of the universe)

Let $\phi_T : \mathcal{A}_L \to \mathbb{C}$ be Gibbs states for temperature $T$ on the entire lattice. For some
$T_2 > T_1 > 0$, let $\phi_+$ be the restriction of $\phi_{T_2}$ to $\mathcal{A}_{L_+}$ and $\phi_-$ the restriction of $\phi_{T_1}$ to
$\mathcal{A}_{L_-}$. Then we define the "initial state of the universe" by

$$\phi := \phi_+ \otimes \phi_- .$$

Definition 17 (physical prior for Hamiltonian systems)

For every $t$ we define the mixed state

$$\phi_t := \phi \circ \alpha_t .$$

Let $|\psi\rangle$ be the state vector of some pure state on $\mathcal{A}_R$. Then

$$P_t(\psi) := \phi_t(|\psi\rangle\langle\psi|)$$

is the probability for obtaining the state $|\psi\rangle$ after the time $t$ when measuring a
non-degenerate self-adjoint operator that contains $|\psi\rangle$ as one of its eigenvectors.

In the spirit of Solomonoff's prior, we would like to give higher prior probability to states that
are simple in an intuitive sense than to complex ones. For instance, we would consider
the basis state $|0\rangle\langle 0|^{\otimes R}$ (i.e., all cells in the region $R$ are in the state $|0\rangle\langle 0|$) as simple.
It is possible that a small program makes the Hamiltonian dynamics generate free
memory space by using the temperature gradient. This is at least not forbidden by
any obvious thermodynamic laws. Thermodynamics also allows for processes that use
the existing temperature gradient to either lower the temperature of some region in
$L_-$ (refrigerator driven by a heat engine, see also [23]) or increase the temperature of
$L_+$ even further. The size of the program required to make $\alpha_t$ implement such a
process would then be the physical complexity of the process. This is only meant to be
one of many examples of how physically universal CAs define the complexity of physical
processes, no matter whether they are computation processes or not.

An interesting modification of the above would be to replace the lattice
with a field-theoretic model, where nets of subalgebras are assigned to regions in
space-time [51], and to define physical universality for a field theory. As opposed to the discrete
model, this would allow for the definition of an even "more physical" prior that is
invariant under the full Lorentz group.

4 Conclusions 

The main contribution of this paper is to introduce and motivate the concept of phys- 
ically universal CAs and Hamiltonians. Their non-existence would probably have in- 
teresting consequences for the limits of controlling microscopic systems. But also their 
existence poses questions that are equally fundamental, because such CAs are nice 
models for studying the thermodynamic cost of computation and control. 

We also use physically universal CAs to define the complexity of states and a 
corresponding prior probability that is considered as a physically motivated analog of 
Solomonoff's prior. An interesting feature of this prior is that it is invariant under 
some physical symmetries. Moreover, it tries to capture the amount of adjustments 
that is needed in the environment to run a preparation process, which includes also 
the cost of removing disturbing heat and the cost of keeping it away from the system 
during the implementation of the process. 

The author would like to thank Bastian Steudel and David Balduzzi for helpful
comments on an earlier draft and Aram Harrow and Armen Allahverdyan for interesting
discussions. This work has partially been supported by the VW-project "Quantum
Thermodynamics: energy and information flow at nanoscale".